“Computer Vision algorithms enable machines to identify and classify objects, then react accordingly.”

CV & Imaging

Week 1

Lecturer: Prof. Hamid Dehghani

F2F: 12 noon Weds, 3pm Friday, 10am Mon via Zoom (lab section)

 

Matlab-Based tutorial

Robotic Vision

Content

pdf

Start with why we look at things: to know what is where, by looking. P15~16

image-20220202122445231

prior knowledge (physics etc.) matters

image-20220202122613036

Evolution of Eyes

We see things because light reflects off objects.

image-20220202122917247

Light of different frequencies interacts differently with materials; this is how we collect the information.

Humans perceive electromagnetic radiation with wavelengths 360-760 nm.

f = c/λ, E = hf

E is energy (J), c = speed of light (m/s), λ = wavelength (m), h = Planck’s constant (6.626×10⁻³⁴ Js)
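A quick sanity check on these relations (`photon_energy` is my own helper name, constants rounded):

```python
# Sketch: photon frequency and energy from wavelength, using f = c/λ and E = hf.
c = 3.0e8          # speed of light (m/s)
h = 6.626e-34      # Planck's constant (J*s)

def photon_energy(wavelength_m):
    """Return (frequency in Hz, energy in J) for a given wavelength."""
    f = c / wavelength_m
    return f, h * f

# Green light at 550 nm, inside the visible 360-760 nm band:
f, E = photon_energy(550e-9)
```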

Only capture light from 1-direction

Capture different intensities of light with better directional resolution.

Pin Hole for only projection

It now gives a sharp image, but throws away a lot of light/information.

image-20220202123803652

Lenses

Snell’s Law: refraction makes submerged objects look shallower than they really are.

image-20220202123940981

So a lens collects and focuses more light, following Snell’s law.

The pupil controls the amount of light entering the eye.

 

 

image-20220202124102958

image-20220202124315175

The upper part of the brain processes the lower part of the visual field?????

Check it out.

What our eyes see is actually upside-down..

 

 

image-20220202124551351

How much the lens magnifies or reduces the image.

image-20220202124708640

The back of the eye is not flat. P34

 

Retina

Contains two types of photoreceptors: rods and cones.

image-20220204150317126

image-20220204150547970

Receptive Field

The RF is the area on which light must fall for the neuron to be stimulated.

Two types of ganglion cells: "on-center" and "off-center"

Lecture 1.2 - Human Vision (1).pdf P12 ~ 13

 

image-20220204152139648

Some ganglion cells are sensitive to boundaries…

Need more reading of the slides…..?

The rate of firing also carries information.

image-20220204152436769

image-20220204152731652

No. 3: Not a total crossover, but a partial crossover, because the brain needs information from both sides.

 

 

Where is the Color?

Three different types of cones.

Trichromatic coding…

image-20220204153057623

Why so few blue cones?

How to discriminate wavelengths 2nm in difference?

A camera has filters that allow only one type of colour light to go through.

Colour Mixing

But some colors do not exist?

One can imagine bluish-green or yellowish-green, but NOT greenish-red or bluish-yellow!

Many forms of colour vision were proposed; until recently some were hard to disprove.

1930s: Hering (German Physiologist) suggested colour may be represented in visual system as ‘opponent colours’

Yellow, Blue, Red and Green – Primary colours

Opponent Process Coding

Bluish green, yellowish green, orange (red and yellow), purple (red and blue) OK

image-20220204153655853

Excitation and inhibition cancel each other; no change in signal.

We have Red-green Ganglion cell and Yellow-blue ganglion cell.

 

Week 2

Edge Detection

imagesc (image scale) function in MATLAB

Squeezing the displayed intensity range makes edges more visible.

Gradient of the intensity, i.e. how fast the pixel intensity is changing:

Gx = ∂f/∂x, Gy = ∂f/∂y, M(G) = √(Gx² + Gy²), a(x, y) = tan⁻¹(Gy/Gx)

What is θ = atan2(Gy, Gx)?

M is the magnitude, a is the direction.

Operators or Masks

2×2 masks for the corner:

Gx = [-1 1; -1 1], Gy = [-1 -1; 1 1]

image-20220209122217135

Roberts

Gx = [1 0; 0 -1], Gy = [0 1; -1 0]

Sobel

Gx = [-1 0 1; -2 0 2; -1 0 1], Gy = [-1 -2 -1; 0 0 0; 1 2 1]

Then we get a gradient matrix; by applying a threshold we get a binary edge image.
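A minimal sketch of masks -> gradient -> threshold (toy image and threshold are my own choices; naive convolution instead of a library call):

```python
import numpy as np

# Sobel masks as above; convolve, take gradient magnitude, then threshold.
sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
sobel_y = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=float)

def conv2_valid(img, k):
    """'Valid' 2-D convolution (kernel flipped, as in true convolution)."""
    k = np.flipud(np.fliplr(k))
    H, W = img.shape
    kh, kw = k.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * k)
    return out

img = np.zeros((8, 8)); img[:, 4:] = 1.0   # one vertical step edge
gx = conv2_valid(img, sobel_x)
gy = conv2_valid(img, sobel_y)
mag = np.sqrt(gx**2 + gy**2)               # gradient magnitude M(G)
edges = mag > 2.0                          # threshold -> binary edge image
```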

Edge values actually follow a Gaussian distribution, but can be quite noisy.

If we just set a threshold, we may get multiple border lines; hence the use of Canny.

Gaussian (Canny) edge detection

  1. Apply Gaussian filter to smooth the image in order to remove the noise

  2. Find the intensity gradients of the image, using Roberts, Prewitt, or Sobel, etc.

  3. Apply gradient magnitude thresholding or lower bound cut-off suppression to get rid of spurious response to edge detection

  4. Apply double threshold to determine potential edges

  5. Track edge by hysteresis: Finalize the detection of edges by suppressing all the other edges that are weak and not connected to strong edges.
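The five steps can be sketched roughly as follows (helper names and thresholds are my own; non-maxima suppression is omitted for brevity, and hysteresis is done by growing strong edges through 4-connected weak pixels):

```python
import numpy as np

def gaussian_kernel1d(sigma, radius):
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def smooth(img, sigma=1.0):
    k = gaussian_kernel1d(sigma, 3)
    tmp = np.apply_along_axis(lambda r: np.convolve(r, k, mode='same'), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode='same'), 0, tmp)

def canny_like(img, low, high):
    s = smooth(img)                        # 1. Gaussian smoothing
    gx = np.gradient(s, axis=1)            # 2. intensity gradients
    gy = np.gradient(s, axis=0)
    mag = np.hypot(gx, gy)
    weak = mag >= low                      # 3./4. double threshold
    edges = mag >= high                    # "strong" edges
    for _ in range(img.size):              # 5. hysteresis: grow strong edges
        grown = weak & (edges |
                        np.roll(edges, 1, 0) | np.roll(edges, -1, 0) |
                        np.roll(edges, 1, 1) | np.roll(edges, -1, 1))
        if np.array_equal(grown, edges):
            break
        edges = grown
    return edges

img = np.zeros((12, 12)); img[:, 6:] = 1.0   # vertical step edge
edges = canny_like(img, low=0.05, high=0.2)
```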

 

Filtering

Highly Directed Work

 

Mean filter:

Randomly distributed noise (positive and negative noise even out).

Gaussian Filter:

image-20220211153730108

G_2D = G_1D · G_1Dᵀ
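That separability is easy to verify numerically (σ = 1 assumed): the 2-D Gaussian kernel is the outer product of a 1-D Gaussian with itself.

```python
import numpy as np

x = np.arange(-2, 3)
g = np.exp(-x**2 / (2 * 1.0**2))
g /= g.sum()                   # normalised 1-D Gaussian
G2d = np.outer(g, g)           # G_2D = G_1D * G_1D^T, a rank-1 matrix
```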

Laplacian Operator

image-20220211154327251

It is good to have the second derivative: its zero-crossing points can be a good edge estimator, but they are not robust to noise.

∇²(I ∗ G_2D) = I ∗ (∇²G_2D)

So ∇²G_2D can be a new filter, called LoG (Laplacian of Gaussian).
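A sketch of building the LoG kernel directly from the analytic Laplacian of a Gaussian, ∇²G ∝ (r² - 2σ²)/σ⁴ · exp(-r²/2σ²), so one convolution replaces smoothing plus Laplacian (σ and radius are my own choices):

```python
import numpy as np

def log_kernel(sigma=1.0, radius=4):
    x, y = np.meshgrid(np.arange(-radius, radius + 1),
                       np.arange(-radius, radius + 1))
    r2 = x**2 + y**2
    k = (r2 - 2 * sigma**2) / sigma**4 * np.exp(-r2 / (2 * sigma**2))
    return k - k.mean()        # force zero sum, as a Laplacian filter should have

K = log_kernel()               # convolve the image once with K instead of
                               # smoothing and then taking the Laplacian
```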

Advanced Edge Detection

What cause intensity changes?

image-20220216121320738

Edge Descriptors

Direction - perpendicular to the direction of maximum intensity change (i.e., edge normal)

Strength - related to the local image contrast along the normal

And Position

image-20220216121520636

Main Step in ED

(1) Smoothing: suppress as much noise as possible, without destroying true edges.

(2) Enhancement: apply differentiation to enhance the quality of edges (i.e., sharpening)

(3) Thresholding: determine which edge pixels should be discarded as noise and which should be retained (i.e., threshold edge magnitude).

(4) Localization: determine the exact edge location.

Upsample: sub-pixel resolution might be required for some applications to estimate the location of an edge to better than the spacing between pixels

image-20220216121830295

But it is super noisy.

image-20220216121941953

 

h is a Gaussian filter, but it slightly blurs the edges.

image-20220216122021305

Instead of convolving with h and then differentiating, we can convolve with the differentiated Gaussian, d/dx (h ∗ f) = (dh/dx) ∗ f, which saves one operation.

image-20220216122114369

Prewitt Operator

Gx = [-1 0 1; -1 0 1; -1 0 1], Gy = [-1 -1 -1; 0 0 0; 1 1 1]

image-20220216122423103

Practical Issue

Noise suppression-localization tradeoff.

– Smoothing depends on mask size (e.g., depends on σ for Gaussian filters).

– Larger mask sizes reduce noise, but worsen localization (i.e., add uncertainty to the location of the edge) and vice versa

image-20220216122521720

We want good localization and a single response.

Canny Edge Detector

image-20220216123128065

image-20220216123149604

image-20220216123252181

image-20220216123442349

We get a thick edge, so we then keep only the local maximum along the edge gradient direction.

Non-maxima suppression

Check if gradient magnitude at pixel location (i,j) is local maximum along gradient direction

 

Hysteresis thresholding

Standard thresholding can only select “strong” edges; it does not guarantee “continuity”.

image-20220216123859997

image-20220216124005724

 

Scale Invariant Feature Transform (SIFT)

Given the noisy image, design the best suitable algorithm to detect edges.

Given the calculated edges, how would you quantify accuracy?

Why do we want to match features?

Tasks like Object Recognition, Tracking…

Types of invariance:

image-20220303142343127image-20220303142237212image-20220303142250380

How to achieve illumination invariance?

How to achieve scale invariance?

image-20220303143403986

image-20220303143703743

Rotation Invariance

Under rotation, the orientation histogram will have the same distribution, just circularly offset.
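A quick numerical check of this idea (8 orientation bins of 45° assumed): rotating the image rotates every gradient vector, so the orientation histogram keeps its shape and is only circularly shifted (by two bins for a 90° rotation).

```python
import numpy as np

def orientation_histogram(img, bins=8):
    gy, gx = np.gradient(img.astype(float))
    theta = np.arctan2(gy, gx)                  # gradient orientation
    hist, _ = np.histogram(theta, bins=bins, range=(-np.pi, np.pi),
                           weights=np.hypot(gx, gy))
    return hist

rng = np.random.default_rng(0)
img = rng.random((16, 16))
h0 = orientation_histogram(img)
h90 = orientation_histogram(np.rot90(img))      # 90 degree rotation
```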

 

Handout 4.2

Hough Transform

Polar Space and Cartesian Space

coordinates

Distance from the origin

image-20220225151010677

image-20220225151137606

 

The Hough transform is a common approach to finding parameterised line segments (here, straight lines).

The basic idea:

Each straight line in the image can be described by an equation in (w, ϕ), where ϕ is the angle and w the distance from the origin.

Each isolated point can lie on an infinite number of straight lines.

In the Hough transform each point votes for every line it could be on.

The lines with the most votes win.
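The voting scheme above can be sketched as follows (my own discretisation choices for the (w, ϕ) accumulator): each point votes for every line w = x·cos ϕ + y·sin ϕ it could lie on, and the accumulator peak wins.

```python
import numpy as np

def hough_lines(points, n_phi=180, n_w=400, w_max=25.0):
    phis = np.linspace(0, np.pi, n_phi, endpoint=False)
    acc = np.zeros((n_phi, n_w), dtype=int)
    for x, y in points:
        w = x * np.cos(phis) + y * np.sin(phis)   # every line (w, phi) through (x, y)
        w_idx = np.round((w + w_max) / (2 * w_max) * (n_w - 1)).astype(int)
        ok = (w_idx >= 0) & (w_idx < n_w)
        acc[np.arange(n_phi)[ok], w_idx[ok]] += 1  # one vote per candidate line
    return acc, phis

pts = [(x, 5) for x in range(20)]       # collinear points on the line y = 5
acc, phis = hough_lines(pts)
phi_i, w_i = np.unravel_index(acc.argmax(), acc.shape)
w_best = w_i / 399 * 50 - 25            # decode the bin back to a distance w
```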

Hough Space

(w,ϕ)

image-20220225152551399

It also conducts non-maxima suppression to keep the best lines.

We need to set a threshold A, the minimum number of points (votes) required to declare a line.

A hough map

image-20220225153059790

There are generalised versions for ellipses and circles.

For the straight-line transform we need to suppress non-local maxima.

The input image could also benefit from edge thinning

Single line segments not isolated

Will still fail in the face of certain textures

 

Circle Hough Transform

image-20220225153622056

The strength of the Hough transform technique is that it is tolerant of gaps in feature boundary descriptions and is relatively unaffected by image noise, unlike edge detectors.

Lecture 5. Image Registration

Segmentation of Ageing brain

image-20220302120932875

atlas: literally “a collection of maps”; here, a reference template image.

 

Co-register the images

image-20220302121040053

 

image-20220302121317303

image-20220302121454461

image-20220302121739243

Landmarks: eyes, ears etc. or curve of features

Image values: conservation of intensity

image-20220302122100956

Needs the same dimensions and resolution.

hard to handle different features

image-20220302122258805

Different pixel values are more likely to belong to different groups.

The joint histogram

image-20220302122501565

image-20220302122544856

Class of Transforms:

What similarity criterion to use?

image-20220302123531184

maintain the distances between features.

image-20220302123611968

  1. RMS

  2. Mutual Info

    image-20220302123714703

    Maximize the probability of the location given the pixel value.

    What is p_{i,j}?

    image-20220302123743633

  3. What is Normalised cross-correlation?
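Sketches of the three criteria above (my own simple discretisations, not the lecture's code): RMS difference, normalised cross-correlation (NCC), and mutual information computed from the joint histogram p_{i,j}.

```python
import numpy as np

def rms(a, b):
    return np.sqrt(np.mean((a - b) ** 2))

def ncc(a, b):
    a = a - a.mean(); b = b - b.mean()
    return np.sum(a * b) / np.sqrt(np.sum(a**2) * np.sum(b**2))

def mutual_information(a, b, bins=16):
    p, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p = p / p.sum()                        # joint probabilities p_ij
    px = p.sum(axis=1, keepdims=True)      # marginals
    py = p.sum(axis=0, keepdims=True)
    nz = p > 0
    return np.sum(p[nz] * np.log(p[nz] / (px @ py)[nz]))

rng = np.random.default_rng(1)
a = rng.random((32, 32))
b = rng.random((32, 32))                   # an unrelated image
```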

 

Computational Vision

 

Characterising images as signals

Image Statistics

Signal-to-noise (SNR)

image-20220304151507565

image-20220304151808754

Non-automated: take 5~6 images and average through them.
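A quick check of why averaging helps: the noise standard deviation drops roughly as 1/√N when averaging N independent noisy frames (synthetic data, my own noise level):

```python
import numpy as np

rng = np.random.default_rng(0)
truth = np.ones((64, 64))                              # noise-free "image"
frames = [truth + rng.normal(0, 0.5, truth.shape) for _ in range(6)]
avg = np.mean(frames, axis=0)                          # average of 6 frames

noise_single = np.std(frames[0] - truth)               # ~0.5
noise_avg = np.std(avg - truth)                        # ~0.5 / sqrt(6)
```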

Histogram-based segmentation

image-20220304150556638

Thresholding challenges

image-20220304152402554

How do we determine the threshold ?

Different regions / image areas may need different levels of threshold.

Many approaches possible

What is OTSU (Otsu's method)? #TODO
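A sketch answering the TODO, assuming the usual formulation: Otsu's method picks the threshold that maximises the between-class variance w0·w1·(μ0 − μ1)² of the grey-level histogram.

```python
import numpy as np

def otsu_threshold(img, bins=256):
    hist, edges = np.histogram(img, bins=bins)
    p = hist / hist.sum()
    idx = np.arange(bins)
    best_t, best_var = 1, -1.0
    for t in range(1, bins):
        w0, w1 = p[:t].sum(), p[t:].sum()    # class probabilities
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (p[:t] * idx[:t]).sum() / w0   # class means (in bin units)
        mu1 = (p[t:] * idx[t:]).sum() / w1
        var = w0 * w1 * (mu0 - mu1) ** 2     # between-class variance
        if var > best_var:
            best_var, best_t = var, t
    return edges[best_t]

# Bimodal toy "image": two intensity clusters around 0.2 and 0.8.
rng = np.random.default_rng(0)
img = np.concatenate([rng.normal(0.2, 0.05, 500), rng.normal(0.8, 0.05, 500)])
t = otsu_threshold(img)
```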

 

Mathematical Morphology

image-20220304152821808

image-20220304152834298

Dilation

Erosion
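A minimal sketch of both operations with a 3×3 square structuring element, implemented with shifts rather than a morphology library (all names are my own):

```python
import numpy as np

def shift_or_zero(img, di, dj):
    """Shift a binary image by (di, dj), filling exposed borders with False."""
    out = np.zeros_like(img)
    H, W = img.shape
    out[max(di, 0):H + min(di, 0), max(dj, 0):W + min(dj, 0)] = \
        img[max(-di, 0):H + min(-di, 0), max(-dj, 0):W + min(-dj, 0)]
    return out

def dilate(img):
    # on if ANY 3x3 neighbour is on
    return np.any([shift_or_zero(img, di, dj)
                   for di in (-1, 0, 1) for dj in (-1, 0, 1)], axis=0)

def erode(img):
    # on only if ALL 3x3 neighbours are on
    return np.all([shift_or_zero(img, di, dj)
                   for di in (-1, 0, 1) for dj in (-1, 0, 1)], axis=0)

img = np.zeros((7, 7), dtype=bool)
img[2:5, 2:5] = True                  # a 3x3 foreground square
```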

Two advanced segmentation methods

 

Active contours (snakes)

image-20220304153144897

image-20220304153154094

image-20220304153213040

Watershed Segmentation

image-20220304153501783

image-20220304153513547

image-20220304153849584

(Active) 3D Imaging and 3D

About touching the world…

image-20220309120343092

Robotic Manipulation

https://www.cs.bham.ac.uk/research/groupings/robotics

image-20220309121543037

3D Imaging

It is hard for people to interpret the first image.

We can use both of them at the same time.

image-20220309121832515

Depth versus distance

image-20220309121928432

How to measure depth and distance?

image-20220309122022691

Passive

Active

Stereophotogrammetry

But it is hard to process the related images (i.e. find the matching pixels).
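Once matching pixels are found, depth follows from the standard pinhole stereo relation Z = fB/d, for focal length f (in pixels), baseline B and disparity d; a tiny sketch with made-up numbers:

```python
def depth_from_disparity(f_px, baseline_m, disparity_px):
    """Pinhole stereo: depth Z = f * B / d."""
    return f_px * baseline_m / disparity_px

# e.g. f = 700 px, baseline 7 cm, disparity 20 px:
Z = depth_from_disparity(700, 0.07, 20)   # metres
```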

image-20220309122405711

 

image-20220309122300870

image-20220309122432980

Structure from motion

We have one camera, but it is moving…

image-20220309122630153

image-20220309123452117

Predict where the surface is; this needs more prior knowledge, such as the location of the camera.

 

Depth from focus

Move the lens to change the focus…

Look for sharp edges, though they do not appear everywhere.

image-20220309122845623

It is possible but it is quite noisy.

 

Passive

image-20220309123813432

 

Active Stereophotogrammetry

R200 Camera

Holes appear where no correspondence is found.

image-20220309124338528

TOF (Time of Flight)

Noisy when there are multiple objects, so we only look at one direction at a time.

image-20220309124701611

We now have a wave, so a wave bounces back…

Collect different pixels at different times…

image-20220309125313247

 

 

Dmitry..

Structured Light

image-20220312171222797

image-20220312171458132

image-20220312173308429

image-20220312173934967

Phase wrapping and unwrapping

image-20220312174401909

image-20220312174427902

Photometric stereo

The goal is not the depth, but the surface (normals)…

image-20220312174813723

image-20220312174948739

image-20220312175026363

image-20220312175131814

3D Structure Data

Convert depth data into a point cloud.
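A sketch of that conversion, assuming pinhole intrinsics (fx, fy, cx, cy): each depth pixel (u, v, Z) back-projects to a 3-D point (X, Y, Z) in camera coordinates.

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth map to an (N, 3) array of camera-frame points."""
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    X = (u - cx) * depth / fx
    Y = (v - cy) * depth / fy
    return np.stack([X, Y, depth], axis=-1).reshape(-1, 3)

depth = np.full((4, 4), 2.0)                         # flat wall 2 m away
pts = depth_to_point_cloud(depth, fx=100, fy=100, cx=2.0, cy=2.0)
```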

image-20220312175228990

image-20220312175236959

image-20220312175248254

Try to find the function to build surfaces (from the gradients).

image-20220312175347447

Representations: Untextured mesh and textured mesh

image-20220312175510489

image-20220312175521335

Collecting multiple views of a scene (world coordinates)

Robot coordinates

image-20220312175607060

image-20220312175636834

How to combine point clouds?

image-20220312175750995

image-20220312175807276

image-20220312175823028

image-20220312175831685

image-20220312175850854

image-20220312175859645

ICP algorithm

image-20220312175930891

image-20220312175939080

Multi-steps…

image-20220312180016880image-20220312180030879image-20220312180041720image-20220312180055579image-20220312180103211

image-20220312180120023

ICP…
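A minimal ICP sketch (brute-force nearest neighbours, rigid transform via the SVD/Kabsch step; my own helper names, not the lecture's implementation):

```python
import numpy as np

def best_rigid(src, dst):
    """Rigid (R, t) minimising ||R @ src_i + t - dst_i|| over paired points."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:       # guard against a reflection
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, cd - R @ cs

def icp(src, dst, iters=20):
    cur = src.copy()
    for _ in range(iters):
        d2 = ((cur[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
        matched = dst[d2.argmin(axis=1)]        # nearest-neighbour pairing
        R, t = best_rigid(cur, matched)
        cur = cur @ R.T + t
    return cur

# Target: a 4x4 grid; source: the same grid slightly rotated and shifted.
xs, ys = np.meshgrid(np.arange(4.0), np.arange(4.0))
dst = np.stack([xs.ravel(), ys.ravel()], axis=1)
th = 0.05
R_true = np.array([[np.cos(th), -np.sin(th)], [np.sin(th), np.cos(th)]])
src = (dst - dst.mean(0)) @ R_true.T + dst.mean(0) + np.array([0.1, -0.05])
aligned = icp(src, dst)
```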

 

Others

image-20220312180143578

 

Principal Components Analysis (PCA)

Covariance

A measure of how much each of the dimensions varies from the mean with respect to the others.

Covariance Matrix

image-20220316120808560

 

How to interpret covariance?

The value itself doesn’t mean much, but its sign can be used to determine the direction of correlation.

If it is 0, they are uncorrelated (not necessarily independent).

PCA

It can simplify a dataset in R^d.

It eliminates the later components, reducing dimensionality.

The dimensions in PCA will be orthogonal.

 

 

What is a principal component?

PCA is a useful statistical technique that has found application in:

Basic Theory

image-20220316122431185

Then, we gain the covariance matrix:

image-20220316122548399

N can be the number of pixels in an image.

image-20220316122939584

How much each component contributes; choose the top-k components.
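The recipe above in a few lines on toy 2-D data (`pca` is my own helper name): centre, covariance, eigendecomposition, keep the top-k components.

```python
import numpy as np

def pca(X, k):
    Xc = X - X.mean(axis=0)                # centre each feature
    C = (Xc.T @ Xc) / (X.shape[0] - 1)     # covariance matrix
    vals, vecs = np.linalg.eigh(C)         # eigh: C is symmetric
    order = np.argsort(vals)[::-1]         # sort by eigenvalue, descending
    return vals[order][:k], vecs[:, order][:, :k]

# Data stretched along the first axis, so the first PC should be ~(+-1, 0):
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2)) * np.array([5.0, 0.5])
vals, vecs = pca(X, k=1)
```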

image-20220316123256574

 

Example

image-20220316123601511

image-20220316123616267

image-20220316123637101

image-20220316123738019

image-20220316123748143

 

Singular Value Decomposition (SVD)

image-20220316124228097

image-20220316124238188

Face Recognition (Not Detection)

Ideas:

Eigenfaces

Think of a face as a combination of some component faces.

These basis faces can be differently weighted to represent any faces

So we can use different vectors of weights to represent faces.

image-20220318150856393

How do we pick the set of basis faces?

Statistical criterion for measuring the notion of “best representation of the differences between the training faces”

How to learn?

image-20220318151322938

  1. Rearrange the training set into a 2-D matrix

    • Rows: each face image; Columns: each pixel value

  2. Calculate Co-variance matrix

  3. Then find the eigenvectors of that covariance matrix.

  4. Sort by eigenvalues and find the top-features.

  5. Get the principal components vk

Image space to face space.

image-20220318151950175

Recognition in face space

image-20220318152127939

The closest face in the face space is the chosen match.
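The whole recipe, learning plus matching, sketched on toy data (random vectors stand in for face images; all names are my own, not the lecture's code):

```python
import numpy as np

rng = np.random.default_rng(0)
n_faces, n_pixels, k = 20, 64, 5
faces = rng.random((n_faces, n_pixels))   # 1. training set as a 2-D matrix

mean_face = faces.mean(axis=0)
A = faces - mean_face
C = A.T @ A / (n_faces - 1)               # 2. covariance matrix
vals, vecs = np.linalg.eigh(C)            # 3. its eigenvectors
order = np.argsort(vals)[::-1][:k]        # 4. sort by eigenvalue, keep top-k
V = vecs[:, order]                        # 5. principal components v_k

weights = A @ V                           # image space -> face space

def recognise(new_face):
    """Nearest face in face space is the chosen match."""
    w = (new_face - mean_face) @ V
    return int(np.argmin(((weights - w) ** 2).sum(axis=1)))
```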

But what if the person is wearing a hat or glasses???

Image registration

Some Books

image-20220427121459236

 

Image Matting Problem

image-20220427122141989

 

image-20220429150059601

image-20220429150532191

image-20220429150614080

 

image-20220429151321024

 

image-20220429152721881

Open Source Detectron2 based on PyTorch

image-20220429154132544

Temporal superresolution

image-20220429154342730

image-20220429154441083

image-20220429154600358

image-20220429154837326

image-20220429154953257

image-20220429155103422

image-20220429155427677

image-20220429155549667

image-20220504123501347

 

 

image-20220504124412562

image-20220504124551446

 

 

 

Initialisation:

image-20220506152639141